laplace kernel
On the Similarity between the Laplace and Neural Tangent Kernels
Recent theoretical work has shown that massively overparameterized neural networks are equivalent to kernel regressors that use Neural Tangent Kernels (NTKs). Experiments show that these kernel methods perform similarly to real neural networks. Here we show that NTK for fully connected networks with ReLU activation is closely related to the standard Laplace kernel. We show theoretically that for normalized data on the hypersphere both kernels have the same eigenfunctions and their eigenvalues decay polynomially at the same rate, implying that their Reproducing Kernel Hilbert Spaces (RKHS) include the same sets of functions. This means that both kernels give rise to classes of functions with the same smoothness properties. The two kernels differ for data off the hypersphere, but experiments indicate that when data is properly normalized these differences are not significant. Finally, we provide experiments on real data comparing NTK and the Laplace kernel, along with a larger class of $\gamma$-exponential kernels. We show that these perform almost identically. Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel, which has a simple closed form.
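As a hedged numerical companion to this abstract (not code from the paper), the sketch below evaluates the widely used closed form of the two-layer ReLU NTK and the Laplace kernel as functions of the inner product $u = \langle x, z\rangle$ on the unit sphere. The Laplace width $c$ is an arbitrary choice, and pointwise closeness is only an informal proxy for the spectral statement in the abstract.

```python
import numpy as np

def ntk_relu_2layer(u):
    """Two-layer ReLU NTK on the unit sphere as a function of u = <x, z>,
    using the standard arc-cosine-kernel closed form (up to an overall scale)."""
    u = np.clip(u, -1.0, 1.0)
    kappa0 = (np.pi - np.arccos(u)) / np.pi
    kappa1 = (u * (np.pi - np.arccos(u)) + np.sqrt(1.0 - u ** 2)) / np.pi
    return u * kappa0 + kappa1

def laplace_on_sphere(u, c=1.0):
    """Laplace kernel exp(-c * ||x - z||) restricted to the unit sphere,
    where ||x - z|| = sqrt(2 - 2u)."""
    u = np.clip(u, -1.0, 1.0)
    return np.exp(-c * np.sqrt(2.0 - 2.0 * u))

# Compare both kernels as functions of the angle, each normalized to 1 at u = 1.
u = np.linspace(-1.0, 1.0, 201)
ntk = ntk_relu_2layer(u) / ntk_relu_2layer(1.0)
lap = laplace_on_sphere(u, c=1.0)   # c is a free width parameter, chosen arbitrarily here
print("max pointwise gap |NTK - Laplace| on [-1, 1]:", float(np.abs(ntk - lap).max()))
```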
Predicting kernel regression learning curves from only raw data statistics
Karkada, Dhruva, Turnbull, Joseph, Liu, Yuxi, Simon, James B.
We study kernel regression with common rotation-invariant kernels on real datasets including CIFAR-5m, SVHN, and ImageNet. We give a theoretical framework that predicts learning curves (test risk vs. sample size) from only two measurements: the empirical data covariance matrix and an empirical polynomial decomposition of the target function $f_*$. The key new idea is an analytical approximation of a kernel's eigenvalues and eigenfunctions with respect to an anisotropic data distribution. The eigenfunctions resemble Hermite polynomials of the data, so we call this approximation the Hermite eigenstructure ansatz (HEA). We prove the HEA for Gaussian data, but we find that real image data is often "Gaussian enough" for the HEA to hold well in practice, enabling us to predict learning curves by applying prior results relating kernel eigenstructure to test risk. Extending beyond kernel regression, we empirically find that MLPs in the feature-learning regime learn Hermite polynomials in the order predicted by the HEA. Our HEA framework is a proof of concept that an end-to-end theory of learning which maps dataset structure all the way to model performance is possible for nontrivial learning algorithms on real datasets.
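To make the two required measurements concrete, here is a minimal hypothetical sketch (not the authors' code): it draws anisotropic Gaussian data standing in for images, computes the empirical covariance, and estimates the Hermite coefficients of a scalar target $f_*$ along one assumed direction via Monte Carlo. The spectrum, direction, and target are all illustrative choices.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He   # probabilists' Hermite polynomials He_k

rng = np.random.default_rng(0)

# Hypothetical anisotropic Gaussian "data" with a decaying covariance spectrum.
d, n = 32, 20000
spectrum = 1.0 / np.arange(1, d + 1) ** 2
X = rng.standard_normal((n, d)) * np.sqrt(spectrum)

# Measurement 1: the empirical data covariance matrix.
cov = X.T @ X / n

# Measurement 2: an empirical Hermite decomposition of a scalar target f_*.
w = np.zeros(d); w[0] = 1.0                 # hypothetical direction the target depends on
z = (X @ w) / np.sqrt(w @ cov @ w)          # approximately a standard normal coordinate
f_star = np.tanh(2.0 * z)                   # hypothetical target function
coeffs = []
for k in range(6):
    basis = np.zeros(k + 1); basis[k] = 1.0
    He_k = He.hermeval(z, basis)            # evaluates He_k(z)
    coeffs.append(np.mean(f_star * He_k) / math.factorial(k))   # E[He_k(Z)^2] = k!
print("empirical Hermite coefficients of f_*:", np.round(coeffs, 3))
```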
$L_2$-Regularized Empirical Risk Minimization Guarantees Small Smooth Calibration Error
Fujisawa, Masahiro, Futami, Futoshi
Calibration of predicted probabilities is critical for reliable machine learning, yet it is poorly understood how standard training procedures yield well-calibrated models. This work provides the first theoretical proof that canonical $L_{2}$-regularized empirical risk minimization directly controls the smooth calibration error (smCE) without post-hoc correction or specialized calibration-promoting regularizers. We establish finite-sample generalization bounds for smCE based on the optimization error, the regularization strength, and the Rademacher complexity. We then instantiate this theory for models in reproducing kernel Hilbert spaces, deriving concrete guarantees for kernel ridge and logistic regression. Our experiments confirm these guarantees, demonstrating that $L_{2}$-regularized ERM can provide a well-calibrated model without boosting or post-hoc recalibration. The source code to reproduce all experiments is available at https://github.com/msfuji0211/erm_calibration.
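A minimal sketch of the kernel ridge instantiation follows, on synthetic binary data. It solves the $L_{2}$-regularized ERM problem in an RBF RKHS and then reports a simple kernel-smoothed calibration diagnostic; this diagnostic is a proxy of my own, not the paper's smCE estimator, and the data, kernel width, and regularization strength are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Synthetic binary data: P(y=1|x) depends smoothly on the first coordinate.
n, d, lam = 1000, 5, 1e-2
X = rng.standard_normal((n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X[:, 0]))).astype(float)

# L2-regularized ERM in an RKHS with squared loss (kernel ridge regression):
# minimize (1/n) * sum_i (f(x_i) - y_i)^2 + lam * ||f||_H^2.
K = rbf(X, X)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)

# Held-out predictions, clipped into [0, 1] to read them as probabilities.
Xt = rng.standard_normal((400, d))
yt = (rng.random(400) < 1.0 / (1.0 + np.exp(-Xt[:, 0]))).astype(float)
ft = np.clip(rbf(Xt, X) @ alpha, 0.0, 1.0)

# A simple smoothed-calibration diagnostic (a proxy, not the paper's smCE estimator):
# kernel-smooth the residual y - f as a function of the predicted probability f.
def smooth_cal_proxy(f, y, bw=0.05):
    W = np.exp(-((f[:, None] - f[None, :]) ** 2) / (2 * bw ** 2))
    resid_hat = (W @ (y - f)) / W.sum(axis=1)
    return float(np.mean(np.abs(resid_hat)))

print("smoothed calibration proxy:", round(smooth_cal_proxy(ft, yt), 4))
```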
the paper be accepted
Regarding our proof techniques: the proof of Thm. 1, which concerns the NTK of two-layer networks with bias, borrows techniques from [6]. Our proof technique for deep networks uses the algebra of RKHSs and is therefore novel in this context. Thm. 2 derives bounds that follow from the relation between the Fourier expansions of the Laplace kernel and the NTK (established in Thm. 4) and from identifying the spaces fixed under the appropriate integral transform. Regarding "why they need additional parameters a, b, c": we note that, analogously, the NTK becomes sharper for deeper networks.
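For readers connecting this rebuttal to the abstract above, the underlying Mercer picture on the hypersphere can be written schematically as follows (the precise constants and the common decay exponent $\nu$ are what the paper's theorems pin down):
$$K(x, z) = \sum_{k \ge 0} \lambda_k \sum_{m=1}^{N(d,k)} Y_{k,m}(x)\, Y_{k,m}(z), \qquad x, z \in \mathbb{S}^{d-1},$$
where the $Y_{k,m}$ are spherical harmonics, and the claim is that $\lambda_k^{\mathrm{Laplace}} \asymp \lambda_k^{\mathrm{NTK}} \asymp k^{-\nu}$ for the same exponent $\nu$, which is what makes the two RKHSs coincide as sets of functions.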
Convergence of Mean Shift Algorithms for Large Bandwidths and Simultaneous Accurate Clustering
Pal, Susovan, Vepakomma, Praneeth
The mean shift (MS) algorithm is a non-parametric, density-based, iterative algorithm with prominent applications in clustering and image segmentation. A rigorous proof of its convergence in full generality remains unknown. Two significant steps in this direction were taken in \cite{Gh1}, which proved that for a \textit{sufficiently large bandwidth} the MS algorithm with the Gaussian kernel always converges in any dimension, and in \cite{Gh2}, where the same author proved that MS always converges in one dimension for kernels with differentiable, strictly decreasing, convex profiles. More recently, \cite{YT} proved convergence in greater generality, \textit{without any restriction on the bandwidth}, under the assumptions that the KDE $f$ has a Lipschitz-continuous gradient on the closure of the convex hull of the trajectory of the iterated mode estimates and satisfies the Łojasiewicz property there. The main theoretical result of this paper generalizes \cite{Gh1}: we show that for a \textit{sufficiently large bandwidth}, convergence is guaranteed in any dimension for \textit{any radially symmetric and strictly positive definite kernel}. The proof uses two alternative characterizations of radially symmetric, positive definite, smooth kernels due to Schoenberg and Bernstein \cite{Fass}, and borrows some steps from the proofs in \cite{Gh1}. Although this result is more restrictive than that of \cite{YT} because of the lower bound on the bandwidth, it relies on a different set of assumptions, and the proof technique is different.
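The iteration being analyzed is short to state; below is a minimal sketch of the standard MS fixed-point update with a Gaussian kernel on synthetic two-blob data (an illustration of the algorithm, not of the paper's general positive definite kernel setting; the bandwidth is an arbitrary choice).

```python
import numpy as np

def mean_shift(X, bandwidth, n_iter=300, tol=1e-8):
    """Mean shift with a Gaussian kernel: each mode estimate is repeatedly replaced
    by the kernel-weighted average of the data points, a fixed-point iteration on
    stationary points of the KDE."""
    modes = X.copy()
    for _ in range(n_iter):
        d2 = ((modes[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2.0 * bandwidth ** 2))
        new_modes = (W @ X) / W.sum(axis=1, keepdims=True)
        done = np.abs(new_modes - modes).max() < tol
        modes = new_modes
        if done:
            break
    return modes

# Two well-separated blobs; with this bandwidth the iterates collapse onto the two modes.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-3, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])
modes = mean_shift(X, bandwidth=1.0)
print("distinct modes (rounded):", np.unique(np.round(modes, 1), axis=0))
```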
Querying Kernel Methods Suffices for Reconstructing their Training Data
Barzilai, Daniel, Margalit, Yuval, Gronich, Eitan, Yehudai, Gilad, Galun, Meirav, Basri, Ronen
Over-parameterized models have raised concerns about their potential to memorize training data, even when achieving strong generalization. The privacy implications of such memorization are generally unclear, particularly in scenarios where only model outputs are accessible. We study this question in the context of kernel methods, and demonstrate both empirically and theoretically that querying kernel models at various points suffices to reconstruct their training data, even without access to model parameters. Our results hold for a range of kernel methods, including kernel regression, support vector machines, and kernel density estimation. Our hope is that this work can illuminate potential privacy concerns for such models.
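As one hedged illustration of the query-only setting (for the kernel density estimation case; this is not the authors' attack), the sketch below fits a Gaussian KDE on a small hypothetical training set and then reconstructs the points from black-box density queries alone: a coarse query grid locates high-density regions, and finite-difference ascent on the queried log-density refines each candidate. The bandwidth, grid size, and threshold are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# "Private" training set; the querier only gets black-box access to the fitted KDE.
X_train = rng.uniform(-1, 1, size=(8, 2))
h = 0.05  # KDE bandwidth

def query(x):
    """Black-box model access: the (unnormalized) Gaussian KDE value at x, nothing else."""
    d2 = ((X_train - x) ** 2).sum(axis=1)
    return float(np.exp(-d2 / (2 * h ** 2)).sum())

# Step 1: query on a coarse grid and keep the high-density locations.
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 61), np.linspace(-1, 1, 61)), -1).reshape(-1, 2)
vals = np.array([query(g) for g in grid])
starts = grid[vals > 0.5]

# Step 2: refine each candidate by finite-difference ascent on the queried log-density.
def refine(x, eps=1e-4, steps=50):
    for _ in range(steps):
        g = np.array([(query(x + eps * e) - query(x - eps * e)) / (2 * eps)
                      for e in np.eye(2)])
        x = x + h ** 2 * g / max(query(x), 1e-12)   # log-density gradient step
    return x

# With a small bandwidth, each KDE maximum sits essentially on a training point,
# so queries alone reveal the data.
recovered = np.array([refine(s) for s in starts])
dists = np.linalg.norm(recovered[:, None, :] - X_train[None, :, :], axis=-1).min(axis=1)
print("candidates recovered:", len(recovered),
      " max distance to a true training point:", float(dists.max()))
```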